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Abstract 

Wolbachia are common endosymbionts of terrestrial arthropods, and are also found in nematodes: the animal-parasitic 
filaria, and the plant-parasite Radopholus similis. Lateral transfer of Wolbachia DNA to the host genome is common. We 
generated a draft genome sequence for the strongyloidean nematode parasite Dictyocaulus viviparus, the cattle lungworm. 
In the assembly, we identified nearly 1 Mb of sequence with similarity to Wolbachia. The fragments were unlikely to derive 
from a live Wolbachia infection: most were short, and the genes were disabled through inactivating mutations. Many 
fragments were co-assembled with definitively nematode-derived sequence. We found limited evidence of expression of 
the Wolbachia-der'wed genes. The D. viviparus Wolbachia genes were most similar to filarial strains and strains from the host- 
promiscuous clade F. We conclude that D. viviparus was infected by Wolbachia in the past, and that clade F-like symbionts 
may have been the source of filarial Wolbachia infections. 
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Introduction 

Wolbachia are alphaproteobacterial, intracellular symbionts of 
many non-vertebrate animal species, related to rickettsia-like 
intracellular pathogens such as Anaplasma and Ehrlichia [1]. 
Wolbachia was first detected as a cytoplasmic genetic element 
causing mating type incompatibilities in Culex pipiens mosquitoes, 
and subsequently has been found to infect many insect species [2] . 
In insects, most Wolbachia can be classified as reproductive 
parasites, as they manipulate their hosts' reproduction to promote 
their own transmission [3]. This is achieved by induction of mating 
type incompatibilities, induction of parthenogenesis in females of 
haplo-diploid species, and killing or feminisation of genetic males. 
In some insects, Wolbachia infections are apparently "asymptom- 
atic", in that no reproductive bias has been detected. There is 
evidence that Wolbachia infection can be beneficial to hosts, 
particularly in protection from other infectious organisms [4]. 
Importantly, in most insect systems tested the symbiosis is not 
essential to the hosts, which can be cured by antibiotic treatment. 

Wolbachia strains have been classified into a number of groups 
using molecular phylogenetic analyses of a small number of 
marker loci [5,6] . Insect Wolbachia largely derive from clade A and 
B. Outside Insecta, arthropod Wolbachia infections have been 
identified in terrestrial GoUembola (Hexapoda), Isopoda (Crusta- 
cea), Chelicerata and Myriapoda, and also in marine Amphipoda 
and Cirripeda (Crustacea). Most non-insect arthropod infections 
also involve Wolbachia placed in clades A or B. A minority of 
arthropod infections involves Wolbachia placed in distinct lineages 
(clades E through N) [5,7]. In clade A and B symbionts, 
transmission appears to be essentially vertical (mother to offspring) 



in ecological time, but phylogenetic analysis reveals that lateral 
transfer between hosts has been common on longer timescales. 

Wolbachia infections have also been identified in nematodes, 
notably in the animal parasites of the Onchocercidae. These 
filarial parasites utilise arthropod vectors (dipterans and chelice- 
rates) in transitioning between their definitive vertebrate hosts, but 
the Wolbachia they carry are not closely related to those of the 
vector arthropods. The majority of Wolbachia from onchocercid 
nematodes are placed in two distinct but related clades, C and D 
[6,8]. The biology of the interaction between filarial nematodes 
and their C and D Wolbachia is strikingly different [9] . There is no 
evidence of reproductive manipulation. Transmission is vertical, as 
in other Wolbachia, but, unlike the arthropod symbionts, in species 
with infections all members carry the symbionts, and the 
phylogeny of hosts and symbionts show remarkable congruence. 
Treatment with antibiotics both kills onchocercid nematode 
Wolbachia, and also affects the viability of the nematodes, 
suggesting a strongly mutualistic, possibly essential interaction 
[10,11]. The interaction is not essential on a phylogenetic 
timescale, as nested within the Wolbachia-'mi^ct^d onchocercids 
are species that have lost their infections [12]. The biological bases 
for the mutualism is a topic of significant research interest, and 
may include manipulation of embryogenesis, metabolic provision- 
ing and modulation of host immune responses [9,13-16]. 

Not all nematode Wolbachia are placed in clades C and D [17]. 
Clade F Wolbachia have a distinct host profile compared to the 
other clades, as they have been found in both onchocercid 
nematodes {Mansonella, Madathamugadia and Cercopithifilaria species) 
[18], and arthropods (hexapods and chelicerates). The Wolbachia 
symbiont from the nematode Dipetalonema gracile is the sole 
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Author Summary 

Bovine lungworms are economically important nematode 
parasites of cattle. We have sequenced the genome of the 
bovine lungworm to provide information for drug and 
vaccine discovery. Within the lungworm genome we found 
extensive evidence of an ancient association between the 
lungworm and a bacterium called Wolbachia. The lung- 
worm Wolbachia is now a "fossil" in the genome, but tells 
of an ancient infection. Association between lungworms, 
and related nematode worms, and Wolbachia was not 
known previously. We have used the lungworm Wolbachia 
sequence to explore the history of nematode-l/l/o/6c7c/?/c7 
interactions, particularly the jumping of these symbionts 
between arthropods and nematodes. 

representative of clade J, but is closely related to clade C Wolbachia 
[19]. A Wolbachia infection has been described in Radopholous similis, 
a tylenchid plant parasitic nematode distantly related to the 
Onchocercidae [20]. This symbiont has been placed in a new 
clade I. The biological role(s) of these nematode Wolbachia have yet 
to be defined. Wolbachia have been sought in other nematode 
species, both parasitic and free-living. These searches, carried out 
using Wolbachia-specific PGR amplification of marker genes, have 
generally proved negative in individuals sampled across the 
diversity of Nematoda other than Onchocercidae [21]. In the 
many ongoing nematode genome and transcriptome projects, 
Wolbachia-derived sequence has only been described from 
onchocercid nematodes and R. similis. However, there are two 
overlapping expressed sequence tags from Ancylostoma caninum (also 



a member of Strongyloidea) that have high similarity to Wolbachia 
genes [22], but these have not been verified as derived from a 
Wolbachia symbiont in this species. (The relationships of the 
nematode taxa discussed are illustrated in Figure 1 [18,23,24].) 

Lateral transfer of Wolbachia genome fragments into the host 
nuclear genome has been detected in arthropods and nematodes 
that carry live infections [25,26]. Inserted fragments range from 
what is likely the whole bacterial genome inserted into an azuki 
beetle chromosome, to short fragments at the limit of specific 
detection. These fragments have excited much debate, particularly 
concerning the Onchocercidae, where it has been hypothesised 
that they may represent functional gene transfers into the 
nematode genome and thus play significant roles in host biology 
[27-30]. However most Wolbachia insertions have accumulated 
many substitutions and insertion-deletion events compared to their 
functional homologues in extant bacterial genomes. In this they 
most resemble nuclear insertions of mitochondrial DNA, which 
are 'dead on arrival' and evolve neutrally in the host chromosome 
[25]. 

Interestingly, the onchocercid nematodes Onchocerca Jlexuosa [31], 
Acanthocheilonema viteae [11,32] and Loa loa [12] lack Wolbachia 
despite their placement within the group of Wolbachia-cont?iming 
species. This suggests that they have lost their live Wolbachia 
infections. Fragments of Wolbachia-like sequence have been 
detected in the nuclear genome in these species [31,33]. Wolbachia 
nuclear transfers, or nuwts, in nematodes that currently lack live 
Wolbachia infection can be thought of molecular fossils of the 
previous symbiosis history of the host. Just as fossil skeletal remains 
can reveal the past distribution of larger biota, and viral insertions 
reveal the history of host infection [34,35], nuwts can reveal past 
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Figure 1. Relationships of nematode species harbouring Wo/bach/a symbionts. A phylogenetic cartoon showing the relationships of the 
nematode species discussed in this work [23]. To the left, the systematic structure of the class Chromadoria is given, and the three major suborders 
within Rhabditida are highlighted. Lifecycle strategies of the groups are indicated. The fine-scale relationships of species discussed in the text are 
given to the right. The presence of live Wolbachia infection (+: yes, -: no), evidence of laterally-transferred Wolbachia sequences in the nuclear 
genome {+: yes, -: no, ?: unknown), and the availability of complete genome sequences {+: yes, -: no, ±: partial genome sequence) for each of the 
species are indicated. 
doi:1 0.1 371 /journal.pgen.1 004397.g001 
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Table 1. Assembly statistics for the Dictyocaulus viviparus 
nuclear genome and the Wolbachia-Wke insertions. 







D. viviparus nuclear 
genome * 


Woibacfiia-Wke 

TiaQiTieniS 


number of reads (million) 


165 




span of data (Gb) 


16 




span of assembly (Mb) 


169.4 


1.0 


number of contigs 


17,715 


193 


N50 length (bp) 


22,560 


10,017 


mean read coverage 


84.53 


119.06 


GC% 


34.5 


34.9 



^ The D. viviparus mitochondrial genome was assembled in four contigs, with 
mean coverage —10,000 fold. The four contigs were aligned to the published D. 
viviparus mitochondrion genome and cover the entire span. 

Fragment lengths were added as full contigs if no nematode-like sequence 
was detected. If the contig contained nematode sequences, only the range of 
the Wolbactiia BLAST hits was added. 
doi:1 0.1 371/journal.pgen.1 004397.t001 



symbioses, and their divergence from current Wolbachia genomes 
can be used to estimate the date of the symbiosis. 

We are engaged in a phylum-wide survey of genomes within the 
Nematoda [36] . As part of our analytic procedures we routinely 
screen raw genomic DNA data for contamination with environ- 
mental, commensal and host DNAs with a pipeline that uses read 
coverage, contig GC% and sequence identity to known protein 
sequences [37]. This serves to identify, and ease removal of, 
contaminating genomes, which in turn improves target genome 
assembly and aids independent assembly of symbiont genomes 
where present. Here we present an analysis of genome sequence 
data from the strongyloidean nematode Dictyocaulus viviparus, the 
bovine lungworm, which reveals molecular fossils of an ancient 
Wolbachia symbiosis in this economically important species, which 
is only distantly related to the previously known nematode hosts 
(Figure 1). 

Results 

The draft genome sequence of Dictyocaulus viviparus 

We generated a draft genome for the strongyloidean nematode 
D. viviparus based on a single adult male specimen provided from a 
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Figure 2. Comparison of the Dictyocaulus viviparus proteome to 
that of other rhabditid nematodes. Venn didgram iiiustrdting the 
orthoMCL clustering of the predicted proteome of Dictyocaulus 
viviparus (DVI) to those of Caenorhabditis elegans (CEL), Heterorhabditis 
bacteriophora (HBA) and Haemonchus contortus (HCO). The numbers of 
proteins clustered and the total number of predicted proteins is given 
below each species' name. 
doi:1 0.1 371/journal.pgen.1 004397.g002 

cow slaughtered at an abattoir in Ngaoundere, Cameroon. The D. 
viviparus genome was assembled using Velvet from 1 6 gigabases of 
cleaned data from 165 million, 100-base, paired-end reads from a 
500 base pair (bp) insert library sequenced on an lUumina 
HiSeq2500 instrument. The draft assembly spanned 169.4 
megabases (Mb) (Table 1). In terms of contiguity, the draft was 
of moderate quality with an N50 (length of contig at which 50% of 
the genome is in contigs of this size or larger) of 22 kilobases (kb), 
and N90 of 5 kb. There were 17,715 contigs above 500 bp. The 
assembly had a GC content of 34.5% and estimated read coverage 
of —80 fold (Figure 2A). The mitochondrial contigs from the 
assembly had >99.5% identity to the published mitochondrial 
genome of Z). viviparus. The size of this draft assembly is within the 
range of published genome sizes from species of the same suborder 
(Rhabtitina), which range from 80 Mb {Heterorhabditis bacteriophora 
[38]) to 320 Mb {Haemochus contortus [38]) (Table 2). Given that we 
used a single library, and had no long-range mapping data, it is 
likely that this genome size estimate is lower than the true genome 
as near-identical repeats will have been collapsed or left 
unassembled. We assessed the completeness of the draft assembly 



Table 2. Genome assembly and annotation metrics of D. viviparus and other Rhabditina species. 



Species 


Dictyocaulus viviparus 


Caenorliabditis elegans 


Haemonchus contortus 


Heterorhabditis bacteriophora 


Assembly size (Mb) 


169.4 


100.3 


368.8 


77.0 


Number of contigs >500 bp 


17715 


6 


19728 


1259 


Mean contig length >500 bp 


9561 


14326628 


18696 


61164 


N50>500 bp 


22560 


1 7493829 


83501 


312328 


GC 


34.5 


35.4 


43.1 


33.3 


Number of N's (Mb) 


0.5 


0 


23.6 


2.6 


Predicted genes 


14306 


20520 


21276 


14667 


Median protein length (bp) 


834 


1017 


900 


423 


Median exon length (bp) 


168 


146 


109 


94 


Median exons per gene 


7 


5 


7 


4 


Reference 


this work 


[43] 


[44,45] 


[38] 


doi:1 0.1 371/joumal.pgen.1 004397.t002 
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Figure 3. Wolbachia sequence in a Dictyocaulus viviparus genome assembly. A. Taxon-annotated GC%-coverage plot of the primary D. 
viviparus genome assembly, with contigs that have significant matches to Wolbachia proteins highlighted in red. A total of 193 contigs spanning 
1 Mb (out of a total assembly span of 169 Mb) had significant similarity to Wolbachia. B. Circos plot comparing the 25 longest of the D. viviparus 
genome contigs that contained i/l/o/6ac/?/a-like sequence to the genome of the Wolbachia endosymbionts of the filarial nematode Brugia malayi 
(n/Bm) [9] and Onchocerca ochengi {wOo). The arcs show BLASTn-derived matches between the contigs and the genome sequences. Transcripts from 
D. viviparus mapped to the assembly are reported as green lines in the outer circle of the figure. C Frequency histogram illustrating the different 
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patterns of coverage of the Wolbochio-Wke scaffolds (black) compared to the nuclear genome scaffolds (green). D. Frequency plot of similarity of D. 
viviparus l/l/o/fc)C7c/?/c7-like sequences to wBm (blue) and wOo (the Wolbachia endosymbiont of the filarial nematode Onchocerca ochengi) (red). Each D. 
viviparus Wolbachia-Wke segment was split into 500 bp fragments, and the best percentage identity with the reference genomes calculated using 
BLASTn. E. The Wolbachia-Wke fragments identified in the D. viviparus genome assembly are co-assembled with nematode genes, and have 
accumulated multiple inactivating mutations. Two putative Wolbachia insertions in nuclear contigs are shown in views derived from the gBrowse 
genome viewer. Each panel shows (from top to bottom) the whole scaffold with the zoomed-in region highlighted, the GC% plot for the scaffold, the 
scale for the zoomed-in region, the read coverage for the zoomed-in region, the genes called by RAST in the zoomed in region and the genes called 
by AUGUSTUS in the zoomed-in region. The upper plot shows scaffold00357 while the lower plot shows scaffold00506. 
doi:1 0.1 371 /journal. pgen.1 004397.g003 



using the Gore Eukaryotic Genes Mapping Approach (GEGMA 
[39]), and identified 90% complete and 93% partial genes. A 
previous Roche 454 transcriptome assembly for D. viviparus [40] 
was used to assess the assembly's completeness in terms of 
representation of known D. viviparus transcripts. Retaining matches 
where over 70% of the transcript were mapped to the same 
genome contig, 87% of transcripts were present in the assembly. 
Many additional transcripts were split between contigs. 

Using a MAKER2 -Augustus pipeline [41,42], we predicted 
14,306 protein-coding genes, with a median length of 834 bp, 
median exon length of 168 bp, and a median of 7 exons per gene. 
We compared this predicted gene set for D. viviparus to those of 
Caenorhabditis elegans [43], H. bacteriophora [38] and H. contortus 
[44,45] using orthoMGL [46]. A majority (75%) of the predicted 
D. viviparus proteins clustered with proteins from these rhabditine 
nematodes (Figure 2). The only species which had a low 
proportion of proteins clustered was H. bacteriophora (—40%), an 
observation that has been noted previously [38]. 

Thus, while the goal of our study was not to produce a high- 
quality reference genome for D. viviparus, the draft assembly and 
annotation produced are still of reasonable quality (Table 2). A 
majority of known D. viviparus genes are present, similarity to 
related nematode species is high, and most of the genes appear to 
be present and in full length. The genome assembly and a 
dedicated BADGER genome exploration environment [47] are 
available from http://dictyocaulus.nematod.es. 

Identification of Wolbachia-Wke sequences in the nuclear 
assembly 

As part of our standard quality control processes, we generated 
a taxon-annotated GG -coverage plot (TAGG plot) [37], with the 
goal of identifying any non-nematode (either bovine host or 
environmental bacterial) contamination (Figure 3 A). This process 
allows identification of contaminants by their presence as contigs 
with differing GG content or estimated read coverage compared to 
that of assured target genome contigs [48]. The taxonomic 
annotation, using the NGBI BLAST+ suite, serves to assign 



contaminant contigs to their possible species of origin. This process 
identified a total of 193 contigs, spanning 1 Mb, that had best 
matches to Wolbachia (Figure 3 B). The Wolbachia-like contigs had a 
GG content very close to the mode for the nematode genome, but 
they had a wide range of estimated coverages, from approximately 
equal to the majority of nematode-derived contigs to 3-4 fold 
higher Figure 3 G). Unusually, the Wolbachia-like contigs were not 
better assembled than the nuclear genome. The lower complexity 
of the alphaproteobacterial genome usually results in more 
contiguous assembly, even at low coverage. 

The putative Wolbachia from D. viviparus (^x^Dv) contigs were 
compared to the complete genomes of Wolbachia from Brugia malayi 
(2x;Bm) [9] and 0. ochengi [wOo) [16]. The average identity of the 
BLAST hits was 84.5% ±3.2% to both of the other Wolbachia 
genomes, indicating similar evolutionary distance from these two 
taxa (Figure 3 D). The matches were distributed across the genomes 
of other Wolbachia (Figure 3 B). The Wolbachia-like fragments were 
uploaded to the RAST server [49] for direct annotation, and 1580 
coding sequences were predicted, almost double than found in 
previous nematode Wolbachia genomes (http://rast.nmpdr.org/ 
?page =JobDetails&job = 112231; Table 3). This elevated number 
largely resulted from frameshifts and stop codons in the middle of 
genes, which fragmented the open reading frames, and overall only 
567 different Wolbachia genes (of a usual 800 to 1500) were 
identified. We also screened the contigs that had Wolbachia matches 
for other informative similarities, and identified 29 that contained 
both nematode and Wolbachia matches (examples are illustrated in 
Figure 3 E). We explored both read coverage and read-pair sanity 
across these 29 contigs using Tablet [50] to validate the co-assembly 
of nematode and Wolbachia-like segments, as de Bruijn graph 
assemblers can create chimaeric contigs. We found the contigs to be 
valid, contiguous regions of the genome. Even in cases such as 
scaffold00357 (Figure 3 E) where the nuclear and Wolbachia 
components had distinct read coverages, manual inspection of the 
presumed Wolbachia-nucleai junctions revealed no issues of incon- 
sistent read pairing or inferred insert length. Segments with much 
higher coverage than the nuclear genome may be derived from 



Table 3. Putative Wolbachia-Wke open reading frames identified in the Dictyocaulus viviparus nuclear genome. 





Feature 


Value 


Comment 


Number of open reading frames (ORFs) * 


1580 




Mean ORF length (bp) 


729 ± 703 


In wBm the mean length is 859±712 bp 


Distinct Woibaciiia genes identified ** 


567 


These are present in 1033 ORFs. 547 ORFs had no similarity to other 
Wolbachia genes. 


Genes identified in only 1 ORF 


318 


134 had <70% coverage; 79 of these genes are not present in wBm 


Genes identified in more than 1 ORF 


249 


Mean number of ORFs per gene identifier =2.9; SD = 1.4 



^Predicted using RAST. The RAST analysis of the Wolbachia-Wke fragments from D. viviparus is available on the RAST server at http://rast.nmpdr.org/ 
?page = JobDetails&job = 1 1 2231 . 

^^These are genes identified by RAST as being similar to genes identified in other Wolbachia genomes. Some genes are present in multiple, distinct copies in the D. 
viviparus assembly. 

doi:1 0.1 371/journal.pgen.1 004397.t003 
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Consensus 

1. Dv_Moredun_wl6S 

2. nDv.l.0.scaf01523_wl6S 

3. nDv.l.0.scaf09320_wl6S_l 

4. nDv.l.0.scaf09320_wl6S_2 



AGAl 



IrriFn Mifir Tm TTiiMiM'-TrrigM Tcmrrn TjUniirM TTnlTrir larr, TTMjjM mMTTBm Trj-inaTanm 



AGACCCGAH-ACGTATTCACCGTGGCATGCTGATCCACGATTACTAG-- CGATTCCAACTTCATGCACTCGA 
AGA C C CGAGAA CG TA T TCA C CG TGG CA TGCTGATCCACGAT TAG TAG -- CGATTCCAAC TTCATGCAC TOGA 
AGACCCGAHAACGTATTCACCGTGGCATGCTGATCCACGATTACTAG-- CGATTCCAACTTCATGCACTCGA 

agacccgaHaacgtattcaccgtggcatgctgatccacgattactagHgcgattccaacttcatgcaHtcga 



Consensus 

1. Dv_Moredun_wl6S 

2. nDv.l.0.scaf01523_wl6S 

3. nDv.l.0.scaf09320_wl6S_l 

4. nDv.l.0.scaf09320_wl6S_2 



GTfGBAGlGlGlS: 



iGAAT'TSG^Gl^GGl 



G^GGGlVF^Gl 



'AG| 



■G|G| 



iGBf G| 



gttgcagagtgcaatccgaattgagatggcttttgagggattagcttagcctcgcgactttgcagcctattg 
gttgcagagtiacaatccgaaltgagatgiicttttgagg-attagcthagcctcgcgactttgcagcchattg 
gttgcagagtgcaatccgaattgagatMcttttgagggattagcttagcctcgcgactttgcagccHattg 
gttgcagagtgcaatccgalgtgagatggcttttgagggattagcttagcctcgcgactttgcagcclattg 



Consensus 

1. Dv_Moredun_wl6S 

2. nDv.l.0.scaf01523j 

3. nDv.l.0.scaf09320j 

4. nDv.l.0.scaf09320j 
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Figure 4. Comparison of Wo/bachia-Wke insertions from two Dictyocaulus viviparus isolates, and relationships of the Cameroon D. 

viviparous. A. 16S rRNA gene fragments from the Cameroon isolate of D. viviparus (obtained through whole genome sequencing) and from the 
Moredun isolate (from specific amplification) are shown aligned. The genome sequence assembly has three copies of Woibachia-Wke 16S genes, two 
tandemly arranged and truncated in scaffold scaf09320, and one in scaffold scaf01523. B. ftsZ gene fragments from the Cameroon isolate of D. 
viviparus (obtained through whole genome sequencing) and from the Moredun isolate (from specific amplification) are shown aligned. While we 
were able to amplify the complete fragment from the Moredun strain, the genome assembly contains only a truncated ftsZ gene (and no consensus is 
shown for the —200 bases of essentially unaligned sequence at the 5' end of the alignment). C. Bayesian phylogenetic analysis of the complete 
nuclear small subunit ribosomal RNA (nSSU) genes of the Cameroon D. viviparus and other Dictyocaulus sp., and outgroups (taken from the European 
Nucleotide Archive). The Cameroon D. viviparus is most similar to the European D. viviparus sequenced previously. RAxML analyses yielded the same 
topology. The 5' gene fragment isolated and sequenced from the Moredun strain was identical to the other D. viviparus nSSU sequences. 
doi:1 0.1 371 /journal.pgen.1 004397.g004 



collapse of dispersed repeat copies of the Wolbachia fragment. From 
these analyses we conclude that the Wolbachia-\\k<^ fragments are not 
from an unsuspected live Wolbachia infection of Z). viviparus, but are 
rather neutrally-evolving insertions of Wolbachia genome fragments 
into the nematode nuclear genome, and are relics of an ancient 
symbiosis, now lost. We have called the fragmented Wolbachia wDw, 
though, obviously, we have no evidence of an extant vdDw organism 
(and in fact regard it as being extinct). 

As a preliminary assessment of whether the insertions are 
restricted to some populations of D. viviparus (and thus that the 
symbiosis may have been recent and only in part of the species), or 
are more widespread (and thus likely to derive from more ancient 
symbiosis), we screened an independent D. viviparus isolate for 
presence of Wolbachia gene fragments. We performed directed 
PGR and Sanger sequencing of Wolbachia gene fragments from a 
D. viviparus isolate maintained at the Moredun Institute, Edin- 
burgh, isolated in Scotland in 2005. Both >^ and 16S rRNA 
fragments were amplified from this strain, and, when sequenced, 
were closely similar to the whole genome assembly-derived 
fragments, but differed by several substitutions (Figure 4 A, B). 
Comparison of the nuclear small subunit ribosomal RNA 
sequence from the assembly to those from Dictyocaulus species 
affirmed the species identification (Figure 4 C). We also screened 
the previous D. viviparus transcriptome assembly [40] for Wolbachia- 
like fragments and identified six transcribed fragments (Table 4) 
that were likely to be derived from Wolbachia, confirming presence 
of symbiont gene fragments in a third isolate. 

These transcribed Wolbachia-Y^kj^ fragments might offer evidence 
for functional integration of the remnants of the vdDw genome into 
the nuclear genome. We thus investigated each fragment for 
possible function. In four of five fragments deriving from protein- 
coding genes there were frameshifts and in-frame stop codons. 
None of the transcribed fragments showed evidence of splicing. 
One transcript, where the Wolbachia-Y\kQ sequence was in the likely 
3' UTR of a nematode gene (a homologue of C. elegans FRM-1), 
showed standard spliceosomal introns in the nematode-gene-like 
part, but the Wolbachia fragment itself was not spliced. Four of the 
transcript fragments were very short (500—600 bases, approxi- 
mately one 454 read length). 

Relationships of the Wolbachia of D. viviparus to other 
Wolbachia 

To identify the relationships of uiDv, sequences from the 
Wolbachia-like contigs were added to a five-gene supermatrix 
(including 16S rDNA, groEL, fts^ dnaA and coxA loci) used 
previously for phylogenetic analyses of Wolbachia [18]. This matrix 
does not include data from all 14 recognised Wolbachia clades, as 
sequencing in most has been limited. vdDw fragments correspond- 
ing to these genes were identified using BLAST and aligned with 
MUSCLE. We were not able to identify a dnaA gene in the D. 
viviparus assembly. We added to the alignment data from vuOo and 
available sequences from the Wolbachia from Radopholus similis 
{wR£). Both RAxML, MrBayes and PhyloBayes analyses suggested 



that wDw belongs to clade F, with strong branch support (Figure 5). 
The long terminal branch of vuDv compared to other Wolbachia in 
the same clade is likely to be a consequence of the accumulation of 
mutations in the wDv regions due to their insertion and 
subsequent neutral evolution in the nematode genome. vuOo was 
placed robustly within clade C as expected. Placement of wRs was 
less definite as it clustered as a sister taxon to clade D, but on a 
long branch with low support. We were unable to recover the 
published phylogeny [20] with iX'Rs arising basally to other 
Wolbachia, even when the matrix was analysed with vuDw excluded 
(data not shown), and thus this previous finding may be a 
methodological artifact. 

One genomic feature that distinguishes clade C and D Wolbachia 
is the absence of WO phage. WO phage are active temperate 
bacteriophage that are present in the sequenced clade A and B 
genomes, and that may mediate genetic transfer of key symbiosis 
genes between strains [51]. Using the 1363 protein sequences 
derived from WO phage available in the NGBI/ENA/DDBJ 
databases we identified 1 5 scaffolds in the D. viviparus genome that 
contained significant (BLAST E-values less than le-20) to WO 
phage proteins. These matches (Table 5) were to a wide range of 
WO phage genes, including capsid proteins, portal proteins, 
secretion system components, recombinases and others. In this 
genomic feature, wDv resembled A and B Wolbachia more than it 
did C and D. 

Discussion 

Fossils of Wolbachia infection reveal an unexpected 
palaeosymbiosis 

D. viviparus is the first nematode from the Rhabditina (the group 
that includes C. elegans and the important animal-parasitic 
Strongyloidea) that has been shown to have a relationship with 
Wolbachia. However, the Wolbachia sequences identified in the draft 
genome sequence do not appear to derive from a living organism, 
but rather show features suggestive of being ancient laterally 
transferred fragments of the genome of a clade F-like Wolbachia, 
which is now extinct. The insertions were not unique to the 
individual Cameroon nematode sampled, but were identified in 
another D. viviparus (from Scotland). Published and unpublished 
transcriptome data for D. viviparus include a very low level of 
fragments that mapped to Wolbachia-Yi^i^ regions of our assembly. 
We suggest that the lateral transfers may be found in aU D. 
viviparus, and that it will be exciting to survey additional 
Dictyocaulinae and related families within Strongyloidea for 
evidence of (palaeo-) symbiosis, and to better date the origin of 
the laterally-transferred fragments. 

Lateral transfers of Wolbachia DNA into the host nucleus, nuwts, 
have been identified previously in filarial nematodes and 
arthropods [26,52,53]. The evidence for the D. viviparus Wolba- 
chia-like sequences being ancient lateral transfers include their 
fragmentation, their interspersion with nematode sequence in 
robustly-assembled contigs, and their having inactivating muta- 
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^Anaplasma marginale 



fwb Armadillidium vulgare 
»wb Tribolium confusum 
»wb Encarsia formosa 
'""'*\wb Nasonia vitripennis (IV) 

^wb Culex quinquefasciatus (Pel) 

Folsomia Candida 



^Anaplasma phagocytophilum 
^^^^^^Ehrlichia ruminantium 




M/b Zootermopsis nevadensis 
wb Zootermopsis angusticollis 
wb Nasonia girauiti 
wb Drosophila simulans (R) 
\wb Nasonia vitripennis (LBII) 
wb Muscifidurax uniraptor 
^wb Nasonia longicornis 
wb Drosophila melanogaster 

>wb Litomosoides sigmodontis 
>wb Wuchereria bancrofti 
>wb Brugia pahangi 
wb Brugia malayi 
>wb Ctenocephalis felis 

wb Dirofilaria repens 
wb Dirofilaria immitis 

wb Dipetalonema gracile 
wb Onchocerca suzukii 
Onchocerca gibsoni 
wb Onchocerca volvulus 
wb Onchocerca ochengi 
wb Mansonella perforata 
wb Kalotermes flavicollis 
wb Coptotermes acinaciformis 

wb Dictyocaulus viviparus 
>wb Cimex lectularius 
wb Madathamugadia hiepei 
>wb Cercopithifilaria japonica 




Figure 5. Analysis of the phylogenetic relationships of the Wo/bach/a nuclear insertions in the Dictyocaulus viviparus genome. 

Phylogenetic tree inferred from 16S rDNA, groEL, ftsZ, dnaA and coxA loci with maximum likelihood (RAxML) and Bayesian (MrBayes, PhyloBayes) 
inference. Branch support is reported as (RaxML/MrBayes/PhyloBayes). Strains representing Wolbachia supergroups A, B, C, D, F and H are indicated. 
doi:1 0.1 371 /journal.pgen.1 004397.g005 



tions. Read coverage of the Wolbachia-like fragments varied 
greatly. If all the fragments derived from the genome of a live 
infection, it would be expected that they would have very similar 
coverage, as seen in other Wolbachia infected nematodes [37,54]. 
Fragments with very high read coverage are likely to be repeats 
(within the nematode genome). While about 1 Mb of contigs had 
matches to Wolbachia, these did not constitute a complete genome. 
Only —60% of the expected Wolbachia gene content was present 
(for example the dnaA gene was missing) and many genes and gene 
fragments were duplicated. Genome fragmentation and gene 
inactivation is suggestive of a long period of residence in the D. 
viviparus nuclear genome [25]. 

Do these Wolbachia-like but nuclear-encoded sequences have a 
current expressed function in D. viviparus? The majority of the 
potential protein-coding genes in the Wolbachia-like fragments 
contain insertions, deletions, frameshift mutations or nonsense 
codons compared to their homologues from living Wolbachia 
genomes. We identified only six Wolbachia-like transcript fragments 
in 61,134 transcripts assembled from 3 million D. viviparus 
transcriptome sequences [40]. Four of the transcript fragments 
were very short, about one 454 read length, and one Wolbachia-like 
match was in the 3 ' untranslated region of a bona fide nematode 
gene. Four of five fragments from protein-coding genes had 
frameshift and in-frame stop codon mutations, while the 16S 
rRNA fragment had a large deletion compared to 16S from living 
Wolbachia. On these bases it is unlikely these Wolbachia-derived 
sequences play roles in D. viviparus biology. 

This discovery suggests that all three suborders of the nematode 
order Rhabditida (Rhabditina, Tylenchina and Spirurina) have 
members whose genomes and biology have been shaped by 
symbioses with Wolbachia. In the well-studied clade C and D 
Wolbachia the relationship has features of mutualism [14]. The 
Wolbachia observed in R. similis is apparently live, as bacterial cells 



can be seen within host cells by microscopy [20], but there are 
currently no data on the nature of the symbiosis: its genome 
sequence is awaited with interest. In D. viviparus we have no 
positive evidence for live infection. Our analyses placed both wDv 
and wK.s close to clade F Wolbachia, and showed that clades C, D 
and F form a group distinct from clades A and B. From these and 
previous [18] analyses Clade F appears more "promiscuous" in its 
host relationships (its known hosts include both nematodes and 
arthropods). The symbiont biology of clade F is not well known: in 
Cimex, the clade F symbiont may be essential for fertility and 
nymphal development [55] but symbiont-host interactions remain 
unexplored elsewhere. We note that the presence of Wolbachia 
(albeit now extinct) in D. viviparus, a nematode that does not use an 
arthropod intermediate vector host, suggests that a simple model 
of nematode acquisition of Wolbachia from their vector arthropods 
is less likely. Clade F-like Wolbachia emerge as a credible source of 
the clade C and D Wolbachia of filarial nematode species. The wDv 
genome was likely to have contained WO phage [51], a mobile 
element present in clade A and B genomes but strikingly absent 
from clade C and D genomes. 

In this scenario, the genomic fossils of Wolbachia found in D. 
viviparus are evidence of infection of an F-like Wolbachia in a 
dictyocauline ancestor. We identified insertions in independent 
isolates of the parasite suggesting that the association was not 
limited to one subpopulation of D. viviparus. We note that there 
are Wolbachia-like sequences in transcriptome data from A. 
caninum, another strongyloidean nematode, and thus it is possible 
that Wolbachia infections may have been widespread in this 
group. While reports of Wolbachia in the strongyloidean 
Angio strong)) lus have been discounted [56,57], we are excited by 
the possibility that other palaeosymbioses, now extinct, may be 
revealed in forthcoming genome projects across the Nematoda 
and Metazoa. 
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Table 6. Analysis software versions and parameter settings. 





Software tool 


Reference 


Version 


Parameters used^ 


Comments 


FASTQC 


[60] 


vO.10.1 






fastq-mcf 


[61] 


ea-utils.1. 1.2-537 


-1 51 -q 20 -qual-mean 20 -R 




Blobology 


[37] 


2013-10-21 


default 




Khmer 


[62] 


khmer-1 7-05-201 3 


-k 20 -C 20 -p 




Velvet 


[63] 


1 .2.08 


-exp_cov auto -cov_cutoff auto 


Kmer length of 51 was used 


GapFlller 


[64] 


v1-1 1 


-0 10 -m 55 




clc_bio 


program used: clc_mapper 


4.1.0 


-1 0.9 -s 0.9 




BLAST 


[67] 


2.25 


default 




CEGMA 


[39] 


2.0 


default 




SNAP 


[76] 


2006-07-28 


default 


used within MAKER pipeline 


GeneMark 


[77] 


v.2.3e 


-BP OFF -max_nnn 500 - 
min_contig 10000 




MAKER2 


[41] 


2.25 


default 


maker_opts file changed 


Augustus 


[42] 


2.7 


script used: auto_Aug.pl 




orthoMCL 


[46] 


2.0.3 


default 




MUSCLE 


[78] 


3.8.31 


default 




RAxML 


[68] 


7.6.4 


-m GTRGAMMA 




MrBayes 


[69] 


3.2 


Iset nst= 6 rates = gamma 




PhyloBayes 


[70] 


2.3 


-cat -gtr 




FigTree 


[72] 


3.0.2 




used in construction of Figure 3 
C and Figure 4 


ITOL 


[71] 






used in construction of Figure 3 
C and Figure 4 


Geneious 


www.geneious.com 


R7 




used for construction of 
Figure 3 A, B 



* Unless otherwise specified, default parameters were used. 
doi:1 0.1 371/journal.pgen.1 004397.t006 



Finally, we provide a first draft assembly and annotation of the 
important nematode parasite D. viviparus. The identification of 
our specimen as D. viviparus is based on close identity of 
sequenced loci and the complete mitochondrial genome between 
our specimen and previously published D. viviparus data. As the 
specimen was destroyed during DNA extraction we no longer 



have a voucher for the individual. We note that there are very 
few records of Z). viviparus in sub-Saharan Africa, and it is typically 
described as a temperate species [58]. A very large abattoir 
survey in the Democratic Republic of Congo found only 3 
infected carcasses from 571 examined, and all of these were from 
cattle reared above 1,500 m (Ngaoundere is at 1,200 m) [59]. 



Table 7. PCR test for Wolbachia insertions. 



Primer F (name, sequence Primer R (name, sequence 5' Dictyocaulus Caenorhabditis Litomosoides Reference for 

Target gene 5' to 3 ) to 3 ) viviparus* elegans*' sigmodontis*' primers 



Wolbachia 16S 
rRNA 


Wspec16S_F1 


Wspec16S_R1 + 




+ 


[8] 




GAAGATAATGACGGTACTCAC 


GTCACTGATCCCACTTTAAATAAC 








Wolbachia ftsZ 


ftsZ_F1 


ftsZ_R1 + 




+ 


[8] 




ATYATGGARCATATAAARGATAG 


TCRAGYAATGGATTRGATAT 








nuclear 
nSSU 


F04 


R26 + 


+ 


+ 


[74] 




GCTTGTCTCAAAGATTAAGCC 


CATTCTTGGCAAATGCTTTCG 








mitochondrial 
coxl 


LCD 1490 


HC02198 + 


+ 


+ 


[75] 




GGTCAACAAATCATAAAGATATTGG 


TAAACTTCAGGGTGACCAAAAAATCA 









^ + strong positive band observed, and sequence confirmed; - no PCR product observed. All PCRs used New England BioLabs Phusion HP mix, an annealing 
temperature of 58 °C, 35 cycles of amplification, and were repeated twice with identical results. 
doi:1 0.1 371/journal.pgen.1 004397.t007 
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The genome and annotation can be used as a springboard for 
further analysis both investigating the Wolbachia-nermLtode 
interaction and also potential gene identification for drug and 
vaccine development. 

Materials and Methods 

Nematode isolation and genome sequencing 

A single Dictyocaulus vivipams male was isolated from Bos indicus 
(an individual of the local Gudali breed) in Ngaoundere abattoir, 
Adamawa Region in Cameroon by David Ekale and Vincent 
Tanya during the ongoing Enhancing Protective Immunity 
Against Filariasis EU- Africa programme. The nematode was 
frozen at — 80°C and shipped to Liverpool, UK, where DNA was 
extracted using the DNeasy Blood & Tissue Kit (Qiagen). 
Genomic sequencing was carried out by the Edinburgh Genomics 
Facility, using lUumina TruSeq library preparation reagents and a 
HiSeq 2500 instrument. A single 300 bp insert library was 
constructed, and 100 base paired-end data generated. Raw data 
have been submitted to the International Nucleotide Sequence 
Database Consortium under the project accession PEJEB5116 
(study ERP004482). 

Genome assembly and annotation 

All software tools used (including versioning and command line 
options used) are summarised in Table 6. The quality of lUumina 
reads was checked with FASTQC [60]. Raw reads were quality 
trimmed (base quality of 20), and paired reads were discarded if 
either pair was below 51 bases using fastq-mcf [61]. The trimmed 
reads were digitally normalised to ~20X coverage with khmer 
[62]. A draft assembly was generated using the normalised reads 
with Velvet [63] and gaps within scaffolds were filled using 
GapFiller [64] . Scaffold coverage was obtained by mapping all the 
reads back to the assembly using the clc-bio toolkit (CLC-Bio Ltd). 
Taxon-annotated GC% -coverage plots (TAGC plots) [37] were 
used to identify potential bovine and other contamination. Bovine 
contamination, which was minimal, was removed. 

A MAKER2-Augustus annotation pipeline was used to predict 
protein-coding genes from the genome [41]. The MAKER2 
program combines multiple ab initio and evidence-based gene 
predictors and predicts the most likely gene model. MAKER2 was 
run in a SGE cluster using the SNAP ab initio gene fmder trained 
by CEGMA [39] output models, GeneMark-ES ab initio finder, D. 
vivipams transcripts and SwissProt proteins. We used the MAKER2 
predictions to train Augustus [42] and create a gene fmder profile 
for D. vivipams. Using the gene finder profile, the assembled 
transcriptome [40] and available expressed sequence tag data [65] , 
Augustus was used alone to predict the final gene set, which was 
used for downstream analysis. Protein sets from selected nematode 
species, downloaded from Wormbase [66], were clustered using 
orthoMCL [46]. 

Analysis of Wolbachia-Wke fragments 

The Dictycaulus vivipams draft assembly was broken into 500 bp 
fragments and each fragment was compared to Bmgia malayi and 
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